Abstract: This paper proposes a system that detects the emotion
and mental state of a person from body pose rather than from
facial expression. Most work on emotion detection relies on
facial expressions, although psychological studies indicate
that every body part conveys expression. Frameworks already
exist for facial expression detection, but not for body
gestures. We study different postures of the human body and
their movement during an interaction. We first study the normal
body pose and estimate a threshold from it; this threshold
separates a person who is calm from a person who deviates from
a calm attitude. Along with shoulder movement, we study hand
gestures to obtain more reliable results. Combining the hand
and shoulder results gives an approximate picture of the
person’s state of mind. Earlier similar projects based their
implementation and results on facial expression, whereas our
approach detects emotions from bodily gestures. To detect
bodily expressions we have written an algorithm that helps
predict the behavior of a person. The experimental results show
that the algorithm can infer emotion and state of mind from
human pose, in terms of body gestures of the shoulder and hand.

Index Terms: Gestures, emotions, human poses, affective
computing


I. INTRODUCTION 

Communication plays a vital role in daily life; it lets us
connect with others and express ourselves as individuals or as
a group. It is a basic need for building and developing our
relationships, education and work, and it would be hard to
exist without it. Human communication is both verbal and
nonverbal. Verbal communication refers to the spoken language
used to convey messages or ideas in day-to-day life, whereas
nonverbal communication conveys them through body language,
i.e. facial expression, eye contact, body movement and
posture, gesture, touch, etc. The study of emotion has moved
from psychology to computing, marked by the book of Picard
[15], which created a new field called Affective Computing.
This book provides standards and ideas for building
intelligent emotion detecting systems. Emotion detection has
influenced a wide range of areas including medicine, education
and health, and to a greater extent Human Computer Interaction
[4]. An emotion detection system usually takes its input in
the form of audio, images or other visual data and produces
results according to that input. Visual emotion detection goes
back to the classical study of Darwin, who gave us an idea of
how emotions can be recognized from the face and body; this
created two broad fields for detecting emotions: facial and
bodily. Scientists and researchers have shown a keen interest
in facial expression. Detection from facial expression has a
successful history owing to the work of Ekman and Friesen, who
introduced the Facial Action Coding System (FACS) [7]. This
system provides the standard and outline for facial emotion
detection, and many face recognition and facial emotion
detection systems have evolved from it. Compared with facial
expressions, bodily expressions have received little attention
from researchers.



Fig. 1 Structural Model 

Psychology, however, supports the role of bodily expressions
with evidence [19] and with studies of nonverbal communication
gestures [6]. This evidence indicates that bodily expressions
are as important as facial expressions, and it has led
scientists and researchers to reconsider this area of emotion
detection. Body gestures include hand movements, shoulder
position, torso alignment, head position and leg movements.
According to Walter and Walk [19], body posture alone (with
hands and face hidden) allows a person’s emotions to be
predicted about as accurately as the analysis of facial
expression does.

Fig. 2 (a) Defensive position; (b) Closed attitude; (c) Open
attitude

TABLE 1
Expressive Elements of Posture

Emotion    Body Posture
---------  ------------------------------------------------
Anger      Head backward, no abdominal twist, arms raised
           forwards and upwards, shoulders lifted.
Joy        Head backward, no chest forward, arms raised
           above shoulder and straight at the elbow,
           shoulders lifted.
Sadness    Head forward, chest forward, no abdominal twist,
           arms at the side of the trunk, collapsed posture.
Surprise   Head backward, chest backward, abdominal twist,
           arms raised with straight forearms.
Pride      Head backward or lightly tilted, expanded
           posture, hands on the hips or raised above the
           head.
Fear       Head backward, no abdominal twist, arms raised
           forwards, shoulders forwards.
Disgust    Shoulders forwards, head downwards.
Boredom    Collapsed posture, head backwards, not facing
           the interlocutor.


Our main interest is in reading the upper part of the human
body and correlating its pose with emotion, so as to obtain the
emotional state of a person. Studying body posture is
challenging because of differing traditions, the different
attire people wear, background clutter, and occlusion.
Researchers have made efforts to estimate body pose while
reducing such errors [3], [16], [15]. Although these methods
perform well on certain body parts, e.g., the head, their
performance in localizing parts of the lower arms, i.e.,
elbows and wrists, is generally poor. Once we have read the
pose, we combine it with the information in Table 1 to find
the emotions of a person.
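
One possible way to combine an observed pose with Table 1 is to
treat each row as a set of posture cues and rank the emotions by
how well the observed cues overlap with each row. The sketch
below is only an illustration under that assumption: the cue
names, the function rank_emotions and the use of Jaccard overlap
are hypothetical choices made here, not part of the proposed
system.

```python
# Posture cues per emotion, transcribed from Table 1. The short cue
# names are hypothetical labels invented for this sketch, and
# alternatives in the table (e.g. "hands on the hips or raised above
# the head" for pride) are reduced to a single cue.
TABLE_1 = {
    "anger": {"head_backward", "no_abdominal_twist",
              "arms_raised_forward_up", "shoulders_lifted"},
    "joy": {"head_backward", "no_chest_forward",
            "arms_raised_above_shoulder", "elbows_straight",
            "shoulders_lifted"},
    "sadness": {"head_forward", "chest_forward", "no_abdominal_twist",
                "arms_at_side", "collapsed_posture"},
    "surprise": {"head_backward", "chest_backward", "abdominal_twist",
                 "arms_raised_straight_forearms"},
    "pride": {"head_backward", "expanded_posture", "hands_on_hips"},
    "fear": {"head_backward", "no_abdominal_twist",
             "arms_raised_forward", "shoulders_forward"},
    "disgust": {"shoulders_forward", "head_downward"},
    "boredom": {"collapsed_posture", "head_backward",
                "not_facing_interlocutor"},
}


def rank_emotions(observed_cues):
    """Rank emotions by Jaccard overlap with the observed cue set."""
    scores = []
    for emotion, cues in TABLE_1.items():
        union = cues | observed_cues
        score = len(cues & observed_cues) / len(union)
        scores.append((emotion, score))
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    observed = {"shoulders_forward", "head_downward"}
    for emotion, score in rank_emotions(observed)[:3]:
        print(emotion, round(score, 2))
```

A set overlap keeps the lookup transparent, but it ignores how
reliably each cue can be measured; a weighted score would be a
natural refinement.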


II. RELATED WORK 

This section gives a general overview of existing methods for
body language analysis. Like facial expression, body language
expresses a person’s emotion, mood, attitude, and attention. It
also provides other information, including identity, gender,
age, attractiveness, and personality. One of the most active
fields in computer vision is the analysis of bodily expression
in image sequences, including body posture analysis and body
gesture (including gait) analysis. Most existing work can be
classified as model-based or appearance-based. Much progress
has been made in visual human motion analysis in the last two
decades [18].

Darwin (1872) studied how we express our thoughts and feelings
through emotions. According to Darwin, our expressions of
emotion convey our thoughts more than words do. He stated that
our emotions are intricately intertwined with our whole body:
emotions, mind, and body work as one to send signals to other
people. Ekman et al. (1969) identified five characteristics of
how our bodies communicate through movement [8], [9]. Barclay
et al. (1978) carried the study further by examining temporal
and spatial factors. They suggested that successful gender
recognition requires exposure to approximately two walking
cycles, and that the rendering speed has a strong influence on
recognition [2]. Zuckerman et al. (1981) stated that no single
behaviour by itself gives evidence of lying; deception is
threatening to the individual and increases tension, which
leads to certain nervous behaviours. Ekman et al. (1982)
stated that one of the key areas in honest communication is
focusing on a person’s face and maintaining eye contact. The
six most displayed expressions are fear, anger, disgust,
sadness, happiness and surprise [10]. Moghaddam (2002)
investigated nonlinear SVMs for gender classification with
low-resolution thumbnail faces and demonstrated the superior
performance of SVMs over other classifiers [14]. Navarro et
al. (2007) stated that moving the hands out of sight, to the
lower half of the speaker’s body, can indicate deception, and
that rubbing one’s hands together can display closed
characteristics and nervousness.

Hierarchical models combine the benefits of part-based
approaches and multiple-part approaches. These methods model
the whole person at the root and individual body parts at the
leaves. Wang et al. [20] performed inference with hierarchical
poselets, building on the basics of the Pictorial Structure
model. This increased performance and was more cost efficient
than earlier models. Fischler and Elschlager [11] and
Felzenszwalb and Huttenlocher [12], [13] proposed the
Pictorial Structure Model (PSM), which provides a framework
for deformable object detection and pose estimation. Sapp et
al. [17] represented the model as a convex combination of
tree-structured graphs linked through dual variables and
solved it with a dual decomposition algorithm. Zuffi et al.
[21] coupled the poses in two different frames using optical
flow. Although the method gave sound results, it had the
drawback of requiring frame-to-frame refinements.

III. PROPOSED WORK

Our work relies on pose estimation [1]; we study the body pose
of the target person to obtain the body postures at different
instants. After studying the pose, we use the following
algorithm to estimate the emotion of the person. Using this
algorithm we obtain an approximate idea of the person’s mental
state.

The algorithm consists of six steps. First, we divide a given
video into frames so that we can study the body posture in
different frames. Second, using pose estimation [1] we obtain
the stick pose of the upper part of the body, namely the
shoulders, the left and right hands, and the mid torso, using
the cost

    C(I, p) := \sum_{u \in V} \phi_u(I, p^u)
               + \sum_{(u,v) \in E} \psi_{u,v}(p^u - p^v)      (1)

where C(I, p) is the cost of a pose p in an image I,
\phi_u(I, p^u) is an appearance term for the body part u at
the position p^u in I, and \psi_{u,v}(p^u - p^v) is a
deformation cost for the pair of body parts (u, v).
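
To make the structure of (1) concrete, the following sketch
evaluates the cost of one candidate pose as a sum of per-part
appearance terms and per-edge deformation terms. It is a toy
illustration only: the part names, the detections, the rest
offsets and the quadratic forms of both terms are assumptions
made here, not the terms used by the pose estimator of [1].

```python
def pose_cost(pose, edges, appearance, deformation):
    """Evaluate C(I, p): appearance terms over parts plus deformation
    terms over connected part pairs, following the form of (1)."""
    unary = sum(appearance(part, pos) for part, pos in pose.items())
    pairwise = sum(
        deformation(u, v, (pose[u][0] - pose[v][0],
                           pose[u][1] - pose[v][1]))
        for u, v in edges
    )
    return unary + pairwise


# Hypothetical toy terms, assumed only for this example.
DETECTIONS = {"l_shoulder": (0.0, 0.0), "r_shoulder": (4.0, 0.0)}
REST_OFFSETS = {("l_shoulder", "r_shoulder"): (-4.0, 0.0)}


def appearance(part, pos):
    # Squared distance to an assumed detector response for this part.
    dx = pos[0] - DETECTIONS[part][0]
    dy = pos[1] - DETECTIONS[part][1]
    return dx * dx + dy * dy


def deformation(u, v, offset):
    # Squared deviation of the offset p^u - p^v from an assumed rest offset.
    rx, ry = REST_OFFSETS[(u, v)]
    return (offset[0] - rx) ** 2 + (offset[1] - ry) ** 2


if __name__ == "__main__":
    edges = [("l_shoulder", "r_shoulder")]
    pose = {"l_shoulder": (0.0, 1.0), "r_shoulder": (4.0, 0.0)}
    print(pose_cost(pose, edges, appearance, deformation))
```

Passing the two terms in as functions keeps the cost evaluation
independent of how appearance and deformation are actually
modelled.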

For a video sequence I = (I_1, I_2, \ldots, I_T), it is common
to introduce temporal links between frames to impose temporal
consistency on the estimated pose positions p_1, p_2, \ldots,
p_T. This is achieved by adding a temporal edge between every
pair of nodes p^u_t and p^u_{t+1}:

    \sum_{t=1}^{T} C(I_t, p_t)
      + \lambda_\theta \sum_{t=1}^{T-1}
          \theta(p_t, p_{t+1}, I_t, I_{t+1})                   (2)


IV. RESULTS

The proposed system detects emotion from body parts, i.e.
shoulder and hand gestures. We analyzed the shoulders in the
normal pose, found the slopes of the shoulder lines, and from
the slopes evaluated the angle between them. This angle is
used as the threshold to determine the mood and state of mind
of the person.



Fig. 3 (a) A person in a normal, calm posture; (b) a person in
a confused or amazed state; (c) a person in a leaning pose,
depressed or not interested.


In (2), \theta is a consistency term between the poses in two
consecutive frames and \lambda_\theta is a regularization
parameter. We measure the consistency between p_t and p_{t+1}
by comparing p_{t+1} with p_t adjusted with optical flow as
follows:

    \theta(p_t, p_{t+1}, I_t, I_{t+1})
      = \sum_{u \in V} \lVert p^u_{t+1} - p^u_t - f_t(p^u_t) \rVert   (3)

where f_t(p^u_t) is the optical flow between frames I_t and
I_{t+1} evaluated at the position p^u_t. This approach is quite
natural, and similar formulations have been proposed [28].
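
The consistency term in (3) and the sequence cost in (2) can be
sketched as follows. This is a minimal illustration under stated
assumptions: the flow is passed in as a hypothetical function
returning a displacement for a position, the norm is taken to be
the plain Euclidean norm, and the per-frame costs are assumed to
be already computed.

```python
import math


def temporal_consistency(pose_t, pose_next, flow):
    """Sum over parts of the norm of p_{t+1} - p_t - f_t(p_t), as in (3).

    `flow(x, y)` is a hypothetical callable returning the optical-flow
    displacement (fx, fy) at position (x, y) between the two frames.
    """
    total = 0.0
    for part, (x, y) in pose_t.items():
        fx, fy = flow(x, y)
        nx, ny = pose_next[part]
        total += math.hypot(nx - x - fx, ny - y - fy)
    return total


def sequence_cost(frame_costs, consistency_terms, lam):
    """Per-frame costs plus the weighted consistency terms, as in (2)."""
    return sum(frame_costs) + lam * sum(consistency_terms)


if __name__ == "__main__":
    # Assumed uniform flow of one pixel to the right.
    flow = lambda x, y: (1.0, 0.0)
    pose_t = {"l_shoulder": (0.0, 0.0), "r_shoulder": (4.0, 0.0)}
    pose_next = {"l_shoulder": (1.0, 0.0), "r_shoulder": (5.0, 3.0)}
    term = temporal_consistency(pose_t, pose_next, flow)
    print(term, sequence_cost([1.0, 2.0], [term], lam=0.5))
```

The left shoulder moves exactly with the flow and contributes
nothing, while the right shoulder deviates from it and is
penalized.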

Fig. 4 An RGB image is used as input, from which we obtain the
stick pose of the person [28].

Third, we remove the scene details to obtain the stick pose
needed for the further steps. Next, we study the shoulders. We
first study the shoulders of a person standing or sitting in
the normal pose by calculating the slopes of the shoulder
lines,

    \alpha_i = (b_n - b_m) / (a_n - a_m)                        (4)

where \alpha_i denotes the slope of a line representing the
shoulder, and a and b are the coordinates of the line's end
points m and n along the x and y axes respectively. We then
calculate the angle \theta between the two lines using

    \theta = \tan^{-1} |(\alpha_1 - \alpha_2)
                        / (1 + \alpha_1 \alpha_2)|             (5)
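
The slope and angle computations in (4) and (5) can be sketched
as below. The function names and the handling of the two edge
cases (a vertical line, and perpendicular lines for which the
denominator of (5) vanishes) are assumptions added for this
example; the text does not say how these cases are treated.

```python
import math


def slope(point_m, point_n):
    """Slope of the line through (a_m, b_m) and (a_n, b_n), as in (4)."""
    (a_m, b_m), (a_n, b_n) = point_m, point_n
    if a_n == a_m:
        # Assumption: vertical lines are rejected rather than handled.
        raise ValueError("vertical line: slope is undefined")
    return (b_n - b_m) / (a_n - a_m)


def angle_between(alpha_1, alpha_2):
    """Angle in radians between two lines with the given slopes, as in (5)."""
    denominator = 1 + alpha_1 * alpha_2
    if denominator == 0:
        # Assumption: perpendicular lines are reported as 90 degrees.
        return math.pi / 2
    return math.atan(abs((alpha_1 - alpha_2) / denominator))


if __name__ == "__main__":
    # Hypothetical end points of two lines meeting at the origin.
    alpha_1 = slope((0.0, 0.0), (2.0, 2.0))
    alpha_2 = slope((0.0, 0.0), (3.0, 0.0))
    print(math.degrees(angle_between(alpha_1, alpha_2)))
```

Because (5) takes an absolute value, the result always lies
between 0 and 90 degrees and does not depend on the order of the
two lines.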

If the value of \theta calculated for the shoulders is greater
than the threshold obtained from the normal pose, the person
is amazed or confused; if it is less than the threshold, the
person is lazy or not interested in the conversation.
Similarly, in the next step we study the hand movement. If
the value of \theta is greater than the desired threshold, the
hand movement is considered not harmful. If it is less than
the desired threshold, we check whether the hand forms a fist
or a flat hand; if so, the person could cause some harm.
Finally, we combine the results for the hand and the shoulder
to predict the emotional state of the person.
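
The shoulder and hand rules described above can be written as
two small decision functions. This is a sketch under explicit
assumptions: the function names, the label strings, the optional
tolerance band around the calm threshold, and the behaviour for
cases the text leaves open (an angle exactly at the threshold, or
a low hand angle without a fist or flat hand) are choices made
here for illustration. The numbers in the usage lines are
illustrative values, not measured results.

```python
def shoulder_state(theta, calm_theta, tolerance=0.0):
    """Classify the shoulder angle against the calm-pose threshold.

    `tolerance` is a hypothetical dead band; with the default of 0.0
    only an angle equal to the threshold is reported as calm.
    """
    if theta > calm_theta + tolerance:
        return "confused/amazed"
    if theta < calm_theta - tolerance:
        return "lazy/not interested"
    return "calm/normal"


def hand_state(theta, hand_threshold, hand_shape):
    """Classify the hand angle and shape against the desired threshold.

    `hand_shape` is assumed to be one of "fist", "flat" or another
    label produced by an upstream hand-shape detector.
    """
    if theta > hand_threshold:
        return "not harmful"
    if hand_shape in ("fist", "flat"):
        return "potentially harmful"
    # Assumption: the text does not cover this case.
    return "normal gesture"


if __name__ == "__main__":
    print(shoulder_state(73.0, calm_theta=71.5))
    print(hand_state(30.0, hand_threshold=40.0, hand_shape="fist"))
```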



Fig. 5 Removing the background to study the stick pose of the
shoulder.

Fig. 5 shows the stick pose of the shoulder, extracted from
Fig. 4. We first removed the background and the unwanted
sticks depicting other parts of the upper body, and then found
the coordinates of the left and right shoulders.

[Plot "Results": angle between the shoulder lines (y-axis,
ticks 70 to 73) against result number (x-axis, ticks 0 to 10),
with series Calm/normal, Confused/Amazed and Lazy/not
interested.]

Fig. 6 Comparison between the calm, confused and lazy poses.

The graph compares the three poses. When a person is confused
or amazed, the angle is greater than the threshold, i.e.
greater than in the normal pose; when a person is depressed
or lazy, the angle is less than the threshold. From these
results we can infer the approximate state of mind and mood
of a person. The x-axis shows the result number and the
y-axis shows the angle between the lines.

These results are based on Fig. 5. We studied different images
and set a threshold based on the normal human pose. For better
results, the subject can be studied at that instant to find
the person’s emotional state.


TABLE 2
Results According to the Algorithm

                   Shoulder
Hand               Confusion   Calm /    Lazy /
                   and Doubt   Normal    Not-interested
-----------------  ----------  --------  --------------
Anger/Fight        X           Yes       X
Hitting/Tapping    X           Yes       X
Normal Gesture     Yes         Yes       Yes


From the above table we can infer the different emotions and
gestures that can be found using the hand (H) and shoulders
(S). If the hand forms a fist, the person stands in a normal
pose and shows a sign of anger; a person standing in anger is
neither confused nor lazy. If the hand is flat, the person
could be hitting or tapping another person. If the person has
neither a fist nor a flat hand and stands in a normal
position, i.e. with the hands aligned with the torso, then the
person could be calm, confused or lazy. These are approximate
results for the emotions detected from body gestures. Studying
body gestures along with facial expressions can give more
correct, precise and sound results.
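
The combinations in Table 2 can be encoded as a small lookup
that reports which shoulder states are compatible with a given
hand gesture. The dictionary layout, the lowercase labels and
the two helper functions are hypothetical names introduced for
this sketch; only the Yes and X entries come from the table.

```python
# Table 2 as a nested mapping: hand gesture -> shoulder state -> allowed.
# True corresponds to "Yes" and False to "X" in the table.
TABLE_2 = {
    "anger/fight": {
        "confusion/doubt": False,
        "calm/normal": True,
        "lazy/not-interested": False,
    },
    "hitting/tapping": {
        "confusion/doubt": False,
        "calm/normal": True,
        "lazy/not-interested": False,
    },
    "normal gesture": {
        "confusion/doubt": True,
        "calm/normal": True,
        "lazy/not-interested": True,
    },
}


def compatible_shoulder_states(hand_gesture):
    """List the shoulder states marked "Yes" for the given hand gesture."""
    return [state for state, ok in TABLE_2[hand_gesture].items() if ok]


def is_consistent(hand_gesture, shoulder_state):
    """Check whether a hand/shoulder pair is marked "Yes" in Table 2."""
    return TABLE_2[hand_gesture][shoulder_state]


if __name__ == "__main__":
    print(compatible_shoulder_states("anger/fight"))
    print(is_consistent("normal gesture", "lazy/not-interested"))
```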


V. CONCLUSION AND FUTURE WORK

We presented a novel algorithm for estimating the emotional
state of a person from bodily movements. The algorithm uses a
body pose estimation module to obtain the stick pose of the
human posture. Our approach has two parts: 1) we study
shoulder movements for different gestures, and 2) we study the
upper and lower parts of the hand, along with the fist. Using
the human posture, the proposed algorithm analyzes and
predicts the person’s behavior. From the results we found
that without facial expression it is hard to obtain the exact
state of mind of a person; the results predict only the
approximate mood and emotional behavior. Frameworks already
exist for studying facial expressions but not for body
gestures, even though psychological studies indicate that
every part of the body shows emotion. In the near future, a
framework for studying body gestures can be developed so that
different behaviors and emotional states can be read from
them. Hand gestures can also be studied further for use in
robotics and medical fields. Studying body gestures along with
facial expressions can give more correct, precise and sound
results.